16 research outputs found

    Effect of Pore-Water Surface Tension on Tensile Strength of Unsaturated Sand

    A custom-built direct tension apparatus was employed to perform direct tension tests on unsaturated silica sand specimens at different saturation levels and packing dry densities. An attempt was made to understand the effect of the surface tension of the pore-liquid. It was found that the tensile strength decreases as the surface tension of the pore-liquid decreases. However, tensile strength does not decrease as a simple multiple of the ratio of surface tension of the pore-liquid. The experimental results were also compared with the predicted results from two theoretical tensile strength models. Results predicted using the micro-mechanical model agreed well with the experimental results, but only for specimens containing distilled water within the pendular regime. On the other hand, the macro-mechanical model followed the experimental trend reasonably well across the pendular and funicular regimes for specimens containing distilled water. However, at reduced surface tension of the pore-liquid, both models significantly under-predicted the experimental results.

    Using domain knowledge and domain-inspired discourse model for coreference resolution for clinical narratives

    Objective: This paper presents a coreference resolution system for clinical narratives. Coreference resolution aims at clustering all mentions in a single document into coherent entities. Materials and Methods: We employ a knowledge-intensive approach for coreference resolution. The domain knowledge we use includes several domain-specific lists, knowledge-intensive mention parsing, and a task-informed discourse model. Mention parsing allows us to abstract over the surface form of the mention and represent each mention using a higher-level representation, which we call the mention's Semantic Representation (SR). SR reduces the mention to a standard form and hence better supports comparing and matching. Existing coreference resolution systems tend to ignore discourse aspects and rely heavily on lexical and structural cues in the text. We break from this tradition and present a discourse model for "person" type mentions in clinical narratives which greatly simplifies the coreference resolution. Results: We evaluated our system on 4 different datasets which were made available in the 2011 i2b2/VA coreference challenge. The unweighted average of F1 scores (over B-cubed, MUC and CEAF) varied from 84.2 to 88.1%. Our experiments show that domain knowledge proved to be very effective for different mention types for all the datasets. Discussion: Our error analysis shows that most of the recall errors made by the system can be handled by further addition of domain knowledge. The precision errors, on the other hand, are more subtle and indicate the necessity to understand the relations in which mentions participate for building a robust coreference system. Conclusion: This paper presents an approach that makes extensive use of domain knowledge to significantly improve coreference resolution. Upon acceptance of this paper by the journal, we will make our system and the knowledge sources developed publicly available.

    Extraction of events and temporal expressions from clinical narratives

    This paper addresses the important task of event and timex extraction from clinical narratives in the context of the i2b2 2012 challenge. State-of-the-art approaches for event extraction use a multi-class classifier for finding the event types. However, such approaches consider each event in isolation. In this paper, we present a sentence-level inference strategy which enforces consistency constraints on the attributes of events which appear close to one another. Our approach is general and can be used for other tasks as well. We also design novel features, such as clinical descriptors (from medical ontologies), which encode a lot of useful information about the concepts. For timex extraction, we adapt a state-of-the-art system, HeidelTime, for use in clinical narratives and also develop several rules which complement HeidelTime. We also give a robust algorithm for date extraction. For the event extraction task, we achieved an overall F1 score of 0.71 for determining the span of the events along with their attributes. For the timex extraction task, we achieved an F1 score of 0.79 for determining the span of the temporal expressions. We present a detailed error analysis of our system and also point out some factors which can help to improve its accuracy.

    Distributed Strategies for Topic Modeling

    Topic modeling algorithms (like Latent Dirichlet Allocation) tend to be very slow when run over large document collections. In this presentation, we discuss distributed strategies for topic modeling. We use Charm++ as our parallelization framework. Our results show that parallelization can considerably increase the efficiency of topic modeling. Unpublished; not peer reviewed.

    Argobots: A Lightweight Low-Level Threading and Tasking Framework

    In the past few decades, a number of user-level threading and tasking models have been proposed in the literature to address the shortcomings of OS-level threads, primarily with respect to cost and flexibility. Current state-of-the-art user-level threading and tasking models, however, are either too specific to applications or architectures or not as powerful or flexible. In this paper, we present Argobots, a lightweight, low-level threading and tasking framework that is designed as a portable and performant substrate for high-level programming models or runtime systems. Argobots offers a carefully designed execution model that balances generality of functionality with providing a rich set of controls to allow specialization by end users or high-level programming models. We describe the design, implementation, and performance characterization of Argobots and present integrations with three high-level models: OpenMP, MPI, and colocated I/O services. Evaluations show that (1) Argobots, while providing richer capabilities, is competitive with existing simpler generic threading runtimes; (2) our OpenMP runtime offers more efficient interoperability capabilities than production OpenMP runtimes do; (3) when MPI interoperates with Argobots instead of Pthreads, it enjoys reduced synchronization costs and better latency-hiding capabilities; and (4) I/O services with Argobots reduce interference with colocated applications while achieving performance competitive with that of a Pthreads approach.

    Information extraction for clinical narratives

    Recent US government initiatives have made available a large number of Electronic Health Records (EHRs). These EHRs contain valuable information which can be used in Clinical Decision Support (CDS). Information Extraction (IE) from EHRs is therefore a very promising research area. In this thesis, I focus on two tasks, namely Mention Detection and Coreference Resolution. A lot of domain knowledge is available regarding clinical narratives. There are also several tools, such as SpecialistLexicalTools and MetaMap, which help in analyzing clinical narratives. I integrate the domain knowledge and features derived from these tools into the local statistical models. Clinical narratives have a very special format which gives several interconnections between the tasks of mention detection and coreference resolution. A joint formulation for these two tasks is presented in this thesis. Along with this, there is also a discussion regarding a joint formulation for finding the mention types together. Soft constraints are used while formulating the inference tasks. Softening the constraints is helpful because it allows the constraints to be violated during inference. The joint formulation is based on the fact that only local models are learned in the training phase. Inconsistencies in the decisions based on local models are resolved during the global inference step. I report the best results, to date, on end-to-end coreference resolution. The joint formulation presented in this thesis is very general and would benefit other information extraction tasks as well. I have made the systems described in this thesis publicly available for research use.

    Detecting Privacy-Sensitive Events in Medical Text

    Recent US government initiatives have led to wide adoption of Electronic Health Records (EHRs). More and more health care institutions are storing patients' data in an electronic format. This emerging practice poses several security-related risks because electronic data can easily be shared within and across institutions. It is therefore important to design robust frameworks which will protect patients' privacy. In this report, we present a method to detect security-related (particularly drug abuse) events in medical text. Several applications can use this information to make hospital systems more secure. For example, portions of the clinical reports which contain descriptions of critical events can be encrypted so that they can be viewed only by selected individuals. HHS 90TR0003/01. Unpublished; not peer reviewed.

    Efficient Development of Parallel NLP Applications

    Parallel programming is becoming increasingly popular. Computers have increasingly many cores (processors). Also, large computer clusters are becoming available. But there is still no good programming framework for these architectures, and thus no simple and unified way for NLP applications to take advantage of the potential speedup. In this paper, we develop a broadly applicable parallel programming method to solve NLP problems. Our work is in distinct contrast to the tradition of designing (often ingenious) ways to speed up a single algorithm at a time. Specifically, we show how problems which can be expressed in the LBJ framework take advantage of parallelization. We use Charm++ to demonstrate the speedup of NLP applications. Unpublished; not peer reviewed.